AD-A014 799


ASUPT (ADVANCED SIMULATION IN UNDERGRADUATE
PILOT TRAINING) AUTOMATED
OBJECTIVE PERFORMANCE MEASUREMENT
SYSTEM

Wayne L. Waag, et al

Air  Force  Human  Resources  Laboratory 
Brooks  Air  Force  Base,  Texas 

March  1975 


DISTRIBUTED  BY: 


ADA014799 


AIR  FORCE  SYSTEMS  COMMAND 

BROOKS AIR FORCE BASE, TEXAS 78235


NOTICE 


When US Government drawings, specifications, or other data are used
for any purpose other than a definitely related Government
procurement operation, the Government thereby incurs no
responsibility nor any obligation whatsoever, and the fact that the
Government may have formulated, furnished, or in any way supplied
the said drawings, specifications, or other data is not to be regarded by
implication or otherwise, as in any manner licensing the holder or any
other person or corporation, or conveying any rights or permission to
manufacture, use, or sell any patented invention that may in any way
be related thereto.

This interim report was submitted by Flying Training Division, Air
Force Human Resources Laboratory, Williams Air Force Base, Arizona
85224, under project 1123, with Hq Air Force Human Resources
Laboratory (AFSC), Brooks Air Force Base, Texas 78235.

This report has been reviewed and cleared for open publication and/or
public release by the appropriate Office of Information (OI) in
accordance with AFR 190-17 and DoDD 5230.9. There is no objection
to unlimited distribution of this report to the public at large, or by
DDC to the National Technical Information Service (NTIS).

This  technical  report  has  been  reviewed  and  is  approved. 

WILLIAM  V.  HAGIN,  Technical  Director 
Flying  Training  Division 


Approved  for  publication. 

HAROLD E. FISCHER, Colonel, USAF
Commander 




Unclassified 


SECURITY CLASSIFICATION OF THIS PAGE (When Data Entered)


REPORT  DOCUMENTATION  PAGE 


4. TITLE (and Subtitle)


ASUPT AUTOMATED OBJECTIVE PERFORMANCE
MEASUREMENT SYSTEM


7. AUTHOR(s)

Wayne L. Waag
Edward E. Eddowes
John H. Fuller, Jr.
Robert R. Fuller


READ  INSTRUCTIONS 
BEFORE  COMPLETING  FORM 


3. RECIPIENT'S CATALOG NUMBER


5. TYPE OF REPORT & PERIOD COVERED
Interim

July 1973 - June 1974


6. PERFORMING ORG. REPORT NUMBER


8. CONTRACT OR GRANT NUMBER(s)


10. PROGRAM ELEMENT, PROJECT, TASK
AREA & WORK UNIT NUMBERS

62703F 

11230108 


12. REPORT DATE

March  1975 


9. PERFORMING ORGANIZATION NAME AND ADDRESS

Flying  Training  Division 

Air  Force  Human  Resources  Laboratory 

Williams  Air  Force  Base,  Arizona  85224 


11. CONTROLLING OFFICE NAME AND ADDRESS

Hq Air Force Human Resources Laboratory (AFSC)
Brooks Air Force Base, Texas 78235


14. MONITORING AGENCY NAME & ADDRESS (if different from Controlling Office)

15. SECURITY CLASS. (of this report)

Unclassified

15a. DECLASSIFICATION/DOWNGRADING
SCHEDULE

16. DISTRIBUTION STATEMENT (of this Report)


Approved for public release; distribution unlimited.


17. DISTRIBUTION STATEMENT (of the abstract entered in Block 20, if different from Report)


19. KEY WORDS (Continue on reverse side if necessary and identify by block number)

pilot  performance  measurement 
proficiency  measurement 
pilot  evaluation 
objective  measurement 


20. ABSTRACT (Continue on reverse side if necessary and identify by block number)

To realize its full research potential, a need exists for the development of an automated objective pilot
performance evaluation system for use in the Advanced Simulation in Undergraduate Pilot Training (ASUPT)
facility. The present report documents the approach taken for the development of performance measures and also
presents data collected from two preliminary evaluation studies. The results indicated that the objectively derived
measures: (1) correlate highly with instructor ratings, and (2) discriminate between pilots of different experience
levels. These findings are encouraging and demonstrate the potential of the present approach for generating the
needed automated objective pilot performance measurement system.


DD FORM 1473, 1 JAN 73    EDITION OF 1 NOV 65 IS OBSOLETE


Unclassified

SECURITY CLASSIFICATION OF THIS PAGE (When Data Entered)


SUMMARY 


The Advanced Simulation in Undergraduate Pilot Training (ASUPT) facility is designed to be a
research device capable of providing answers regarding the hardware design and effective use of flight
simulators. Using state-of-the-art motion and visual systems, the relationship between simulator fidelity and
training effectiveness, as well as the applicability of advanced training concepts, are to be investigated. Since
ASUPT was designed to be a research simulator, the development of an adequate performance measurement
system becomes the foundation of the proposed program of research.

Approach 

One of the salient characteristics of flying is that it is criterion-directed. The execution of any
maneuver requires that a certain definable objective be met. The degree to which these objectives are met
would appear to represent an adequate description of performance. In other words, a criterion-referenced
approach to objective performance measurement is proposed. Consequently, for each maneuver, the
criterion objectives must be defined in terms of parameters available within the simulator. Using this
approach, a set of performance measures can be generated for each of the maneuvers to be flown in
ASUPT.

Results 

To evaluate the potential of the proposed approach, two preliminary studies were conducted.
Measures were generated for seven basic instrument maneuvers of varying levels of difficulty. Pilots of
different experience levels flew these maneuvers and were evaluated by experienced instructor pilots. The
results indicated that: (1) instructor pilots were consistent in their subjective evaluations, (2) the objective
measures correlated highly with the subjective evaluations, and (3) the objective measures discriminated
between pilots of different experience levels.

Implications 

The data suggest the approach taken to be a viable one. The basic assumptions of the measurement
scheme were corroborated by the data. The results: (1) suggest instructor evaluations represent a useful
criterion for developing objective measures, (2) indicate the objectively derived measures possess a high
degree of validity, and (3) provide some insight into the manner in which instructors assign grades.
Potentially fruitful areas of further research are discussed.




PREFACE 


This document represents a portion of the research program on Project 1123,
Flying Training Development, Dr. William V. Hagin, Project Scientist; Task 112301,
Development of Performance Measurement Techniques for Air Force Flying Training, Dr.
Wayne L. Waag, Task Scientist, being carried out by the Air Force Human Resources
Laboratory, Flying Training Division, Williams Air Force Base, Arizona.




TABLE OF CONTENTS


I. Introduction

Measurement System Requirements

Measurement Development for ASUPT

Preliminary Evaluation - Study I

Preliminary Evaluation - Study II

Implications

References

LIST OF TABLES

Table

1  Comparison of Instructor Judgments at Cockpit and Console For Straight and Level

2  Correlations of Objective Performance Measures With Instructor Evaluations for Straight and Level

3  Prediction of Overall Instructor Pilot Ratings For Straight and Level

4  Prediction of Smoothness Ratings For Straight and Level

5  Inter-Rater Correlations for Study II

6  Correlations of Objective Measures with Instructor Ratings for Study II

7  Descriptive Statistics of Objective Measures for Each Rating Category

8  Comparison of Objective Measures for Experienced and Inexperienced Pilots


ASUPT AUTOMATED OBJECTIVE PERFORMANCE
MEASUREMENT SYSTEM


I. INTRODUCTION

The Advanced Simulation in Undergraduate Pilot Training (ASUPT) facility is designed to be a
research device capable of providing answers regarding both the hardware design and effective use of flight
simulators. Using state-of-the-art motion and visual systems, the relationship between simulator fidelity and
training effectiveness, as well as the applicability of advanced training concepts, are to
be investigated. Since ASUPT is designed to be a research simulator, the development of an adequate
performance measurement system is a [illegible] component of the research program. This report
documents the approach to performance measurement system development which has been taken and
presents the results of two brief validation studies.

Measurement  System  Requirements 

The criterion for evaluating a flight simulation device is its training effectiveness. From past evidence
indicating positive transfer effects, it is assumed that performance in the simulator will be positively related
to performance in the aircraft. As a result, the thrust of the present effort is to develop measures of
performance in the simulator. In addition, it is possible to use pilot performance in the simulator as
a criterion against which to investigate alternative simulator hardware configurations and training strategies.

One of the salient characteristics of flying is that it is criterion-directed. For the execution of any
maneuver or sequence of maneuvers there are definable objectives which must be accomplished. The degree
to which these objectives are met represents an adequate description of performance. In other words, a
criterion-referenced approach to pilot performance forms the basis for the present effort. Following such an
approach, it is apparent that the definition of criterion objectives is of foremost importance. Within the
context of measurement development for ASUPT, the critical question is whether these criterion objectives
can be stated in terms of parameters available within the simulator. In other words, can behavioral
objectives be defined in terms of the state of the simulated aircraft and control inputs of the pilot?

Aside from the requirement to define performance in terms of observable behavioral objectives, there
are others which are applicable. The first is parsimony in the selection of simulator parameters:
criterion objectives are to be defined using as few parameters as possible. Furthermore, they
must be sampled and analyzed on a real-time basis so that the resulting measurement and feedback are
immediate. Since measurement will be an integral part of training students in ASUPT, it is also necessary
that the resulting output be meaningful and easily interpreted by both instructors and students.

In summary, a criterion-referenced approach to measurement system development is to be pursued
within the constraints of the following requirements:

1. Measures will assess the degree to which the criterion objectives are met.

2. Measures will reflect only the most salient characteristics of performance.

3. Measures will be meaningful and interpretable to the users: the student and instructor pilot.

4. Measures will be generated on a real-time basis so that feedback is immediate.

Measurement  Development  for  ASUPT 

Although the criterion-referenced approach represents the rationale for present developmental
efforts, a further set of assumptions has guided implementation on ASUPT.

1. Measurement system development parallels skill acquisition in the student pilot. Since student
pilots acquire flying skills in a hierarchical fashion, measurement development should proceed in a similar
manner. [Remainder of paragraph illegible in source.]




2. Maneuvers can be conceptualized as integrated sequences of steady states and transitions. The
[illegible] attitudes plus transitions from one attitude to another form the conceptual segments for
[text illegible] objectives for each segment, since they are likely to differ from one to another.

3. Performance can be evaluated by sets of parameters reflecting the state of
the simulated aircraft and the control inputs of the pilot. Superior performance is assumed to
involve meeting the criterion objectives as defined by aircraft state, minimizing rates,
accelerations, and forces so that the maneuver is executed smoothly,
and accomplishing these objectives with the least amount of effort, that is, by minimizing control
inputs.

4. Measurement of performance for a given parameter involves a comparison of the obtained value
with an ideal value. The deviation from the ideal provides an
index of performance. Since the ideal is seldom attained, it is more realistic to define acceptable performance in
terms of an empirically determined tolerance band about the ideal value. For parameters reflecting
[text illegible], the value adopted represents a performance level characteristic of the highly
experienced pilot.

5. Implementation of the measurement system requires four phases of development: (a)
definition of a candidate set of simulator parameters, (b) evaluation of the
proposed set of measures for the purpose of validation and simplification, (c) specification of criterion
performance by requiring experienced instructor pilots to fly the maneuver in question, and (d) the
collection of [illegible] data using students as they [text illegible].
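
The fourth assumption above, comparing obtained values with an ideal value and a tolerance band, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `deviation_measures`, the altitude samples, the 25-ft tolerance, and the fraction-in-band statistic are hypothetical and are not taken from the ASUPT software. The mean error is computed here as a signed mean, which is also an assumption.

```python
import math


def deviation_measures(samples, ideal, tolerance):
    """Summarize how closely one sampled parameter tracked its ideal value.

    samples   -- values of one simulator parameter sampled over a segment
    ideal     -- the criterion (ideal) value for that segment
    tolerance -- half-width of the tolerance band about the ideal (hypothetical)
    """
    errors = [value - ideal for value in samples]
    n = len(errors)
    mean_error = sum(errors) / n                           # signed mean deviation
    rms_error = math.sqrt(sum(e * e for e in errors) / n)  # RMS deviation about the ideal
    in_band = sum(1 for e in errors if abs(e) <= tolerance) / n
    return {
        "mean_error": mean_error,
        "rms_error": rms_error,
        "fraction_in_band": in_band,
    }


# Hypothetical straight-and-level segment: hold 15,000 ft within +/- 25 ft.
altitude_ft = [15000, 15020, 15040, 14980, 14960]
measures = deviation_measures(altitude_ft, ideal=15000, tolerance=25)
```

In this made-up segment the signed errors cancel, so the mean error is zero while the RMS error is not. The two statistics therefore carry different information about the same record.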

Preliminary Evaluation - Study I

In keeping with the approach outlined previously, the first maneuver for which measures were
developed was straight and level. [The remainder of this passage is illegible in the source.]


Table 1. Comparison of Instructor Judgments at Cockpit and Console
For Straight and Level

                      Mean          Mean          Inter-Rater
Measure               Cockpit       Console       Correlation

Altitude Control      [illegible]   [illegible]     .8722
Heading Control       [illegible]   [illegible]     .?257
Airspeed Control      [illegible]   [illegible]     .8535
Overall Rating        [illegible]   [illegible]     .9109
Smoothness            [illegible]   4.6250          .62?7

? Digit illegible in source.


The next step was to determine whether the objectively derived measures would reliably predict the
instructor ratings. The adopted criteria were the overall rating and smoothness rating. Table 2 presents the
correlations between the objectively derived measures of performance and these instructor evaluations for
both the cockpit and console raters. A glance at these results suggests several things. First, the correlations
between the measures and instructor evaluations are fairly consistent for both the cockpit and console
ratings. Second, and most importantly, substantial relationships exist between a number of the objectively
derived measures and the instructor pilot (IP) subjective evaluations. Third, the measure RMS vertical
velocity predicted both criteria quite well. These findings were highly encouraging, suggesting that these
objectively derived measures do relate to the instructor's evaluation of performance.
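
A correlation of this kind, between one objective measure and the instructor ratings, can be computed as an ordinary product-moment coefficient. The Python sketch below assumes that is the statistic used; the report does not name it. The function name `pearson_r` and all data values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: one objective measure and one instructor rating per flight.
rms_vertical_velocity = [120.0, 300.0, 80.0, 450.0, 200.0]
ip_overall_rating = [4, 2, 5, 1, 3]
r = pearson_r(rms_vertical_velocity, ip_overall_rating)
```

The sign of the coefficient depends on how the rating scale is oriented. In this made-up example higher ratings go with smaller errors, so `r` is negative. The same function could serve for an inter-rater correlation by passing the cockpit and console ratings as the two sequences.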

Table 2. Correlations of Objective Performance Measures
With Instructor Evaluations for Straight and Level

                                 Overall Rating             Smoothness Rating
Measure                       Cockpit      Console        Cockpit      Console

Mean Altitude Error            .64?2        .7238          .4645       -.4834
Mean Airspeed Error            .582?        .??15          .4365        .4331
Mean Heading Error             .5075        .5134          .472?        .5?53
RMS Altitude                   .7?03        [illegible]    .5750        .607?
RMS Airspeed                   [illegible]  [illegible]    .5678        .6?93
RMS Heading                    .63?3        .6498          .5931        .6941
Mean Stick Force               .2740        .3?59          .3717        .3560
RMS Stick Force                .4434        .3101          .3249        .2639
RMS Stick Movement             .00??*       .0237*         .2789        .3369
RMS Throttle Movement          .164?*       .1?55*         .3320        .3028
RMS Pitch Rate                 [illegible]  .2625          .5019        .5036
RMS Pitch Acceleration         .0781*       .1277*         .2006*      -.1586*
RMS Roll Rate                  .0371*       .0500*         .0365*       .0931*
RMS Roll Acceleration          .1527*       .0976*         .0145*       .0009*
RMS Vertical Velocity          .7737        .7560          .7004        .7277
RMS Vertical Acceleration      .4172        .3770          .4663        .5194

* Nonsignificant.
? Digit illegible in source.


Using a forward selection multiple regression procedure, subsets of variables were selected which were
predictive of the criterion. An iterative procedure was used wherein variables were added to the prediction
equation until the increment in explained variance became statistically nonsignificant. At this point, the
variables in the prediction equation were eliminated from the predictor set and the procedure repeated. In
this manner, multiple sets of predictors were defined. The criteria adopted were the overall rating and
smoothness rating obtained from within the cockpit. Using the equations developed against the cockpit
ratings, predicted scores were then correlated with the ratings obtained at the console. The results of these
analyses are presented in Tables 3 and 4.
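
The iterative procedure just described can be sketched in plain Python as follows. This is a minimal sketch, not the analysis program used in the study. The stopping rule here is a fixed partial-F threshold (`f_to_enter`, a hypothetical parameter) standing in for the significance test, and all function names and data values are invented for illustration.

```python
def solve(a, b):
    """Solve the linear system a.x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def r_squared(columns, y):
    """R squared of a least-squares fit of y on the given columns plus an intercept."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(design[0])
    xtx = [[sum(row[p] * row[q] for row in design) for q in range(k)] for p in range(k)]
    xty = [sum(row[p] * y[i] for i, row in enumerate(design)) for p in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in design]
    mean_y = sum(y) / n
    ss_total = sum((v - mean_y) ** 2 for v in y)
    ss_resid = sum((v - f) ** 2 for v, f in zip(y, fitted))
    return 1.0 - ss_resid / ss_total


def forward_select(measures, y, f_to_enter=4.0):
    """Add the best remaining measure until its partial F drops below f_to_enter."""
    chosen, r2 = [], 0.0
    remaining = list(measures)
    while remaining:
        df_resid = len(y) - len(chosen) - 2
        if df_resid <= 0:
            break
        trials = {name: r_squared([measures[c] for c in chosen + [name]], y)
                  for name in remaining}
        best = max(trials, key=trials.get)
        f_stat = (trials[best] - r2) * df_resid / max(1.0 - trials[best], 1e-12)
        if f_stat < f_to_enter:
            break
        chosen.append(best)
        remaining.remove(best)
        r2 = trials[best]
    return chosen, r2


def successive_sets(measures, y, f_to_enter=4.0):
    """Repeat forward selection, removing each selected set from the predictor pool."""
    pool = dict(measures)
    sets = []
    while pool:
        chosen, r2 = forward_select(pool, y, f_to_enter)
        if not chosen:
            break
        sets.append((chosen, max(r2, 0.0) ** 0.5))  # (measures in set, multiple R)
        for name in chosen:
            del pool[name]
    return sets


# Hypothetical data: three objective measures and a cockpit rating for eight flights.
measures = {
    "rms_altitude": [1, 2, 3, 4, 5, 6, 7, 8],
    "rms_vertical_velocity": [1.2, 1.8, 3.1, 4.3, 4.8, 6.2, 6.9, 8.1],
    "rms_roll_rate": [3, 1, 4, 1, 5, 9, 2, 6],
}
cockpit_rating = [1.1, 1.9, 3.2, 3.8, 5.1, 6.0, 6.8, 8.1]
sets = successive_sets(measures, cockpit_rating)
```

Each entry of `sets` pairs one predictor subset with its multiple R against the cockpit criterion. Checking an equation against the console ratings would additionally require keeping the fitted coefficients, which this sketch omits.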




Table 3. Prediction of Overall Instructor Pilot Ratings
For Straight and Level

         Measure                        Cockpit    Console

Set 1    RMS Altitude                    .9066      .9044
         Mean Altitude Error
         Mean Stick Force
         RMS Throttle Movement

Set 2    RMS Vertical Velocity           .8839      .8867
         RMS Airspeed
         RMS Stick Movement
         RMS Pitch Rate

Set 3    RMS Heading                     .8303      .8178
         Mean Airspeed Error
         RMS Stick Force
         RMS Pitch Acceleration
         RMS Vertical Acceleration

Set 4    Mean Heading Error              .5075      .5134

Table 4. Prediction of Smoothness Ratings
For Straight and Level

         Measure                        Cockpit    Console

Set 1    RMS Vertical Velocity           .7926      .8005
         RMS Throttle Movement
         RMS Altitude
         RMS Pitch Rate

Set 2    RMS Heading                     .7155      .7086
         Mean Altitude Error
         RMS Pitch Acceleration

Set 3    RMS Airspeed                    .6718      .7053
         RMS Vertical Velocity

Set 4    Mean Heading Error              .6248      .6496
         Mean Airspeed Error
         RMS Stick Movement

Set 5    Mean Stick Force                .5585      .5277
         RMS Stick Force



Four subsets of variables were selected which were predictive of the overall ratings. The first subset
(RMS altitude, mean altitude error, mean stick force, and RMS throttle movement) yielded a
multiple R of .9066. The correlation between the predicted score (using the equation developed for the
cockpit rating) and the console rating was .9044. Similar degrees of correspondence were obtained for the
remaining subsets of predictors. For smoothness, five subsets of predictors were identified. Again, high
degrees of correspondence were obtained between the multiple Rs developed for the cockpit ratings and the
subsequent correlation between predicted scores and the console ratings.

The results of the first study were highly encouraging and seemed to warrant the following
conclusions. First, instructor pilots are highly consistent in their evaluations of performance. Consequently,
it is possible to use such evaluations as one criterion against which to validate objective measures of
performance. Second, the objective measures of performance developed for straight and level will reliably
predict instructor evaluations. The demonstration of such predictive validity suggested the approach taken
to be a reasonable one. To further evaluate the proposed system, a second study was undertaken.


Preliminary Evaluation - Study II

Scenarios were developed and the performance measurement software written for the following
maneuvers: change of airspeed; constant airspeed climbs/descents; rate climbs/descents; and the steep turn.
Similar rating forms were developed for each maneuver and the simultaneous cockpit/console evaluation
procedures followed. Four T-37 instructors were the raters, two alternating at the console and the other
two alternating in the cockpit. Ten subjects were used in the second study, again representing a wide range
of skills. Four student pilots in T-37 training, three T-37 IPs, and three civilians were included. Each subject
flew the following set of maneuvers: three airspeed changes; one constant airspeed climb; one constant
airspeed descent; one rate climb; one rate descent; and three steep turns. For each climb/descent, a level-off
to altitude was required.


Inter-rater correlations were computed for each maneuver and are presented in Table 5. As indicated,
the data for climbs and descents were pooled. Overall inter-rater correlations were computed for categories
which were rated for all maneuvers. The data indicate substantial agreement among the raters, especially
for the overall and smoothness ratings, even though the values were somewhat less than obtained for
straight-and-level. Several possible reasons for the lowered inter-rater correlations should be mentioned.
First, the maneuvers in the second study were of increased complexity. Since these maneuvers require
several transitions in addition to a steady state condition, the instructor's job of monitoring all the relevant
parameters is increased. Likewise, the performance of transitions from one steady state to another increases
the number of cues available to the rater within the cockpit. A second possibility concerns rater bias. It is
possible that different subjective criteria were used in ratings of the T-37 IPs as opposed to students. An
examination of the ratings of one of the IPs' performance records yielded large discrepancies between the
cockpit and console ratings. The objective measures appeared to agree with the cockpit ratings in that the
performance was quite good. However, according to the console rater, the performance was considered to
be unsatisfactory. Such data strongly suggest the possibility of rater bias.


Table 5. Inter-Rater Correlations for Study II

                     Change of    CAS              Rate             Steep
                     Airspeed     Climb/Descent    Climb/Descent    Turn       Overall

Altitude Control      .7714        .6334            .7740            .6935      .7279
Airspeed Control      .7763        .4818            .8078            .4238      .6758
Heading Control       .6121        .8987            .7534              +          *
Rate Control            +            +              .6272              +          *
Bank Control            +            +                +              .7743        *
Overall               .7741        .6822            .8045            .7721      .7716
Smoothness            .7311        .7055            .9661            .8592      .8368

+ Not computed for maneuver.
* Not computed since item not rated in all maneuvers.




Correlations between each of the objective measures and the overall and smoothness ratings for each
maneuver were computed and are presented in Table 6. In this case, the ratings from the IP in the cockpit were
used as the criteria. A perusal of the data warrants several conclusions. As expected, parameters reflecting
performance of the criterion objectives were most related to the ratings. Likewise, the RMS values about
the ideal were correlated more highly than mean deviations. The only exception was RMS bank error. In
this case, an error in the computing software was discovered, thereby invalidating the resulting measure.

A forward selection regression analysis was computed in an attempt to develop prediction equations
for the overall steep turn rating. The steep turn was selected since it represented the most difficult of the
maneuvers tested. The initial subset of seven variables selected by the procedure yielded a multiple R of
.8820. This equation was then used to predict the console ratings. The obtained correlation between
predicted and observed console ratings was .9137. Again, multiple subsets were isolated which were
predictive of the criterion. Although not verified, it seems likely that similar sets of prediction equations
could have been developed for the other maneuvers. In any case, it is certainly clear that the objective
measures developed are highly predictive of instructor ratings, thereby demonstrating the validity of the
present approach to measurement.

A further set of analyses was computed in order to relate the objective measures to the instructor
evaluations. Performance records for all maneuvers were pooled and placed into groups according to the
evaluation of the cockpit instructor. Four groups were defined according to the grade assigned of U, F, G,
or E. Descriptive statistics were then computed for each of the four groups. The results are presented in
Table 7. Measures for rate of climb and bank were deleted due to the small number of cases within each of
the groups. An examination of the data warrants several conclusions. First, it is apparent that there are
clearly defined trends for a number of the objective measures across the four groups. There are clear-cut
decreases in root mean square for airspeed, heading, and altitude as a function of the subjective ratings.
Likewise, there are decreases in the variability across subjects. Consequently, lowered ratings are
characterized by increasing within-subject error as well as increasing between-subject variability. Such data
reflect the fact that there is one way to execute the maneuver correctly, but many ways to commit errors
and therefore receive a lower evaluation.
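
The grouping step described above can be sketched briefly in Python. The records below are hypothetical and are not the study data, and the function name `stats_by_grade` is invented. The sketch pools performance records by assigned grade and reports the mean and standard deviation of one objective measure per grade.

```python
from collections import defaultdict
from statistics import mean, stdev


def stats_by_grade(records):
    """Group (grade, value) records by grade and return (mean, SD) for each grade."""
    groups = defaultdict(list)
    for grade, value in records:
        groups[grade].append(value)
    return {
        grade: (mean(values), stdev(values) if len(values) > 1 else 0.0)
        for grade, values in groups.items()
    }


# Hypothetical RMS altitude errors (ft) with the grade assigned by the cockpit instructor.
records = [
    ("U", 180.0), ("U", 95.0),
    ("F", 90.0), ("F", 60.0),
    ("G", 45.0), ("G", 35.0),
    ("E", 20.0), ("E", 16.0),
]
summary = stats_by_grade(records)
```

With these made-up values both the group mean and the group standard deviation shrink from U through E, which is the pattern the text reports for the RMS measures.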

The results also verified the assumption that superior maneuver performance involves the
minimization of control inputs. Performances rated excellent were those which minimized the amount of
stick movement and also minimized stick deviation from the null position. Furthermore, superior
performance was also characterized by a minimum amount of control force, or, in other words, the efficient
use of trim. While stick inputs were found to be related to rated performance, throttle movements were
not. Of the proposed measures of smoothness, only pitch rate, vertical velocity, and vertical acceleration
were related to rated performance. Again, the pattern of decreasing means and variances was found.

As discussed previously, a requirement of the measurement system is that it reliably discriminates
between pilots of different experience levels. In this case, depending on their previous flying experience,
subjects were placed into one of two categories, high versus low experience, and descriptive statistics
were computed. The results are presented in Table 8. It should be pointed out that the statistics computed for
each objective measure were based on data resulting from all the maneuvers in which that measure was
computed. For example, the statistics for heading measures were based on all maneuvers except the steep
turn. The results indicate that a substantial number of the objective measures will differentiate between the
two groups. Generally, the inexperienced group was characterized by higher error scores and much greater
variability. For the steady state parameters, the RMS error scores were more discriminative than the mean
error scores.

Implications 

The results of these two evaluation studies warrant a number of conclusions. First, the data suggest
the approach taken to be a viable one. The assumptions concerning superior maneuver performance have
been corroborated by the data. Experienced pilots, and likewise those performances rated excellent, were
characterized by: (1) meeting the criterion objectives in terms of minimizing aircraft state errors, (2)
minimizing rates and accelerations, at least in the pitch and Z axes, and (3) minimizing control inputs. In
other words, superior performance involves getting the job done, doing it smoothly, and with the least
amount of effort.




Table 6. Correlations of Objective Measures with Instructor Ratings for Study II

[Table values not recoverable from the scanned source.]




Table 8. Comparison of Objective Measures for Experienced
and Inexperienced Pilots

                                    Experienced             Inexperienced
Measure                            Mean        S.D.        Mean        S.D.

Mean Airspeed                    1.4730      2.5232      2.3486      6.1876
RMS Airspeed                     3.4929      2.0878      6.7856      5.0718
Mean Heading                     -.1843      2.2824       .6084      5.4465
RMS Heading                      2.1827      1.4071      5.1275      3.6841
Mean Altitude                  -21.6591     77.8214    -71.5790    316.2510
RMS Altitude                    67.5192     67.3319    193.8939    322.1645
Mean Climb Rate                   .1610      1.4210       .9480      2.7849
RMS Climb Rate                   2.3482       .8100      4.4623      1.5058
Mean Bank                        1.4191       .8855      4.1372      3.5044
Mean Stick Force                 -.9480      1.9727     -1.0806      2.0091
RMS Stick Force                  2.4354      1.7274      2.8486      2.3823
RMS Stick Movement                .1847       .1009       .2778       .1514
RMS Throttle Movement             .6668       .4829       .7679       .3872
RMS Fore-Aft Stick Position      2.0347       .6661      2.1971      1.1259
RMS Lateral Stick Position        .2517       .1356       23968       .2294
Pitch Rate                       1.0769       .9846      2.3859      2.4821
Pitch Acceleration              18.2833     23.9736     22.1778     18.4969
Roll Rate                         .5146       .3078       .5749       .3257
Roll Acceleration                 .5670       .3494       .5420       .3432
Vertical Velocity              172.3421    208.9697    409.9243    825.0552
Vertical Acceleration            1.5010       .7557      1.7047      1.2598

Second, the data indicate that instructor evaluations are a usable criterion for future measurement
system development. The relatively high levels of agreement between the cockpit and console ratings are
particularly encouraging, since the availability of cues was radically different for each rater. The console
rater had access only to the repeater instruments, while the instructor in the cockpit could observe the
students' behavior in addition to the flight instruments. Furthermore, kinesthetic cues were also available to
the cockpit rater. The importance of these different cue sources in instructor evaluations is an area which
should be addressed in future research studies.

Third,  the  objectively  derived  measures  were  shown  to  possess  a  certain  degree  of  validity.  Significant 
correlations  between  these  measures  and  instructor  evaluations  were  obtained.  Furthermore,  the  measures 
were  shown  to  discriminate  between  pilots  of  different  experience  levels.  These  results  are  most 
encouraging  and  convincingly  demonstrate  the  fruitfulness  of  the  present  approach. 
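
As an illustration of the kind of validity check described above, the following minimal sketch computes a
product-moment correlation between one objective measure and instructor ratings. The report does not state
which correlation coefficient was used, so the Pearson formula is an assumption here, and the data values,
the numeric rating scale, and the names pearson_r, rms_altitude_error, and instructor_rating are all
hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data (not from the report): one RMS altitude error per trial,
# paired with a rating coded numerically, higher meaning better.
rms_altitude_error = [20.0, 45.0, 60.0, 110.0, 150.0, 240.0]
instructor_rating = [4, 4, 3, 2, 2, 1]

r = pearson_r(rms_altitude_error, instructor_rating)
print(f"r = {r:.3f}")  # negative: larger errors go with lower ratings
```

With data of this shape, a valid error measure should yield a clearly negative coefficient, since larger
errors accompany poorer ratings.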

Fourth, the results provide some insight into the manner in which instructors assign grades. A
hierarchical model seems most consistent with the data. To receive an excellent (E) rating, errors on the
critical parameters must all be low. As the quality of the rating decreases, the potential for different errors
increases. In fact, it is possible to commit an error involving one parameter, control the others quite well, and
still receive a low evaluation. This suggests that both the number and degree of error are important. The
investigation of instructors' grading strategies is another prime area for future research.
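
One way to picture the hierarchical model described above is the rough sketch below. It is not the
instructors' actual strategy and not a procedure from this report: the function name grade, the tolerance
values, the cut-offs of 2.0 and 3.0, and the rating labels other than E are hypothetical. It shows only how
a rule can depend on both the number of parameters in error and the degree of the worst error.

```python
def grade(errors, tolerances):
    """Assign a rating from per-parameter errors (hypothetical rule).

    Each error is divided by its tolerance; a ratio above 1.0 counts as
    an out-of-tolerance parameter.
    """
    ratios = {name: abs(errors[name]) / tolerances[name] for name in tolerances}
    worst = max(ratios.values())                          # degree of error
    n_out = sum(1 for r in ratios.values() if r > 1.0)    # number of errors

    if n_out == 0:
        return "E"   # excellent: every critical parameter is within tolerance
    if n_out == 1 and worst <= 2.0:
        return "G"   # hypothetical label: one moderate error
    if n_out <= 2 and worst <= 3.0:
        return "F"   # hypothetical label: a few moderate errors
    return "U"       # hypothetical label: many errors, or one very large error


# Hypothetical tolerances (knots, degrees, feet).
tolerances = {"airspeed": 5.0, "heading": 3.0, "altitude": 100.0}

all_low = grade({"airspeed": 2.0, "heading": 1.0, "altitude": 40.0}, tolerances)
one_large = grade({"airspeed": 2.0, "heading": 1.0, "altitude": 400.0}, tolerances)
print(all_low, one_large)
```

In the second call, airspeed and heading are well controlled, yet the single large altitude error is enough
to produce the lowest rating, which is the pattern the paragraph above describes.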

Fifth, the results provide preliminary data concerning the simplification of the present set of
parameters. As expected, parameters reflecting performance of the criterion objectives yielded the highest
validity coefficients. However, measures of mean error were not as effective as root-mean-square error.
Furthermore, a number of the proposed measures of smoothness did not produce any significant
relationships. Roll rate, roll acceleration, and pitch acceleration were not related to either instructor ratings
or experience level. Likewise, throttle movements were not found to be important.
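
The difference between mean error and root-mean-square error can be seen in a small worked example. The
sketch below assumes that both measures are computed from deviations about a commanded value. The altitude
samples, the 15,000-foot target, and the function names are hypothetical.

```python
import math


def mean_error(samples, target):
    """Average signed deviation from the target; opposite errors cancel."""
    return sum(s - target for s in samples) / len(samples)


def rms_error(samples, target):
    """Root-mean-square deviation from the target; errors never cancel."""
    return math.sqrt(sum((s - target) ** 2 for s in samples) / len(samples))


# Hypothetical altitude samples (feet) oscillating about a 15,000 ft target.
altitude_ft = [15100, 14900, 15100, 14900]
target_ft = 15000

me = mean_error(altitude_ft, target_ft)
rms = rms_error(altitude_ft, target_ft)
print(me, rms)  # 0.0 100.0
```

A pilot who oscillates 100 feet above and below the assigned altitude shows a mean error of zero but an RMS
error of 100 feet, which may help explain why RMS error was the more effective measure.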


The results of the preliminary investigations are encouraging. The demonstrated validity of the measures
developed for basic instrument maneuvers indicates the fruitfulness of the present approach. Efforts are
under way for the development of performance measures for more complex maneuvers. It is hoped that the
resulting measurement system will meet both the research and training needs for future studies to be
accomplished in ASUPT.




